Training Language GANs from Scratch
Generative Adversarial Networks (GANs) enjoy great success at image generation, but have proven difficult to train in the domain of natural language. Challenges with gradient estimation, optimization instability, and mode collapse have led practitioners to resort to maximum likelihood pre-training, followed by small amounts of adversarial fine-tuning. The benefits of GAN fine-tuning for language generation are unclear, as the resulting models produce comparable or worse samples than traditional language models. We show it is in fact possible to train a language GAN from scratch --- without maximum likelihood pre-training. We combine existing techniques such as large batch sizes, dense rewards and discriminator regularization to stabilize and improve language GANs. The resulting model, ScratchGAN, performs comparably to maximum likelihood training on the EMNLP2017 News and WikiText-103 corpora according to quality and diversity metrics.
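The abstract's "dense rewards" refer to scoring every generated prefix rather than only the finished sentence, which reduces the variance of REINFORCE-style gradient estimates. As a rough illustration (not the authors' code; the function name, discount factor, and baseline are illustrative assumptions), here is how per-token discriminator scores would be turned into the per-token weights that multiply the generator's log-probability gradients:

```python
def reinforce_token_weights(token_rewards, gamma=0.99, baseline=0.0):
    """Per-token advantages for a REINFORCE-style generator update.

    With dense rewards, the discriminator scores every prefix, so each
    token t is credited with the discounted sum of rewards from t onward,
    rather than a single sentence-level reward delivered at the end.
    (Hypothetical sketch; gamma and baseline values are assumptions.)
    """
    returns = []
    running = 0.0
    for r in reversed(token_rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    # Subtracting a baseline further reduces gradient variance.
    return [g - baseline for g in returns]

# Dense rewards: the discriminator rates every prefix of the sample.
dense = reinforce_token_weights([0.1, 0.4, 0.9])
# Sparse reward: only the complete sentence receives a score.
sparse = reinforce_token_weights([0.0, 0.0, 0.9])
```

Under the dense scheme every token receives a meaningful learning signal, whereas under the sparse scheme early tokens see only a heavily discounted copy of the final reward.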
Reviews: Training Language GANs from Scratch
I've raised my score accordingly, but I still think there need to be more solid results. In particular, while the rebuttal notes that ScratchGAN can almost match the MLE baseline, I am not sure how strong the MLE baseline itself is. Based on sample quality, I suspect that the MLE baseline is quite weak and does not use more modern LM approaches (e.g. ...). Of course, I am not saying that the authors deliberately used weak baselines, but it would be helpful to compare against stronger MLE baselines too.
Weaknesses: - The main weakness is empirical---ScratchGAN appreciably underperforms an MLE model in terms of LM score and reverse LM score.
Reviews: Training Language GANs from Scratch
This paper has required quite a bit of discussion between the reviewers. The concern was that each individual technique proposed in the paper has been tried in the past. However, their combination enabled something that has not been shown before: training a decent text GAN model without MLE pre-training. While the submission does not provide a convincing argument for switching from MLE to GAN in text generation, it is still an important paper. While some may question whether the text GAN direction will ever deliver state-of-the-art language generation models, it is an active area.
Authors: Cyprien de Masson d'Autume, Shakir Mohamed, Mihaela Rosca, Jack Rae